Querying XML Document Collections
Authors
Abstract
In this paper we describe a query interface for XML document collections. An external schema annotation in RDF supplies the information used to build the interface dynamically, tailoring it to the user’s characteristics and to the document structure as described by its XML Schema. The interface makes the user aware of the semantics of the structure, which helps them formulate semantically correct queries. While preparing a query, the user can access ontologies or linguistic resources via web services. Queries are prepared in an intermediate format that can be translated for different search engines. The architecture is fully compliant with web design principles and standards.
Similar resources
XPath Extension for Querying Concurrent XML Markup
XPath is a language for addressing parts of an XML document. It is used in many XML query languages and it can be used by itself for querying XML documents. While XPath is, in general, efficient for querying individual XML documents, it lacks the features for querying over collections of documents or joining parts of the same document. As the amount of complex document-centric XML data is conti...
Approximate Tree Embedding for Querying XML Data
Querying heterogeneous collections of data-centric XML documents requires a combination of database languages and concepts used in information retrieval, in particular similarity search and ranking. In this paper we present an approach to find approximate answers to formal user queries. We reduce the problem of answering queries against XML document collections to the well-known unordered tree ...
Querying Structured XML Document Collections
The number of XML document collections is increasing, and it’s important to effectively query them. Document semantics is in both the text and the structure. In this paper we describe a query interface towards XML document collections. The interface is automatically tailored to the document structure, as described by its XML Schema. External schema annotation in RDF contains information used to...
DescribeX: A Framework for Exploring and Querying XML Web Collections
Flavio Rizzolo, Doctor of Philosophy, Graduate Department of Computer Science, University of Toronto, 2008. The nature of semistructured data in web collections is evolving. Even when XML web documents are valid with regard to a schema, the actual structure of such documents exhibits significant variations across collections for s...
Document Structure Matching for Heterogeneous Corpora
Querying heterogeneous XML document collections is an open problem. Solving it will require building some sort of correspondence between the DTDs of the different sources. We consider here the problem of matching the structure of XML documents from different sources. To that end, we introduce a stochastic structured document model and describe preliminary experiments performed on the INEX collection.